-
Notifications
You must be signed in to change notification settings - Fork 788
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Merged by Bors] - Enable proposer boost re-orging #2860
Conversation
3ebb5ac
to
647622c
Compare
This is blocked on the implementation of proposer boosting in other clients (namely Prysm and Nimbus). I tested this on Prater and attempted some re-orgs, but failed every time due to Teku and Lighthouse not having enough network share to out-vote the others. |
As @paulhauner points out, we should be careful not to try to re-org out blocks that have updated the justified/finalized checkpoint such that excluding them would not change the justified checkpoint. We could do this simply by not re-orging blocks at slot offset 31 in an epoch, or with some more complicated analysis |
For the record, if this pans out it would be an excellent addition to the honest validator spec so that it becomes expected behaviour for validators to penalise late blocks. |
This will help chains with non-standard slot times, e.g. Gnosis Chain.
Based on the currently running experiment I think this feature is working well. I've got 5K validators on Prater proposing late blocks, and another 25K validators running with So far ~59% of the re-orgs attempted have succeeded, and I think the failures are due to some validators on Prater not applying the boost. I'm in communication with the large Prater stakeholders to increase the number of boosting validators so we can confirm the behaviour on a more uniform network. Spreadsheet of all recently attempted re-orgs here: https://docs.google.com/spreadsheets/d/1Of-VB-MNoWmg1bM8gI6sRuef3kgxQD1pQzMRzwVp6II |
## Proposed Changes Mitigate the fork choice attacks described in [_Three Attacks on Proof-of-Stake Ethereum_](https://arxiv.org/abs/2110.10086) by enabling proposer boost @ 70% on mainnet. Proposer boost has been running with stability on Prater for a few months now, and is safe to roll out gradually on mainnet. I'll argue that the financial impact of rolling out gradually is also minimal. Consider how a proposer-boosted validator handles two types of re-orgs: ## Ex ante re-org (from the paper) In the mitigated attack, a malicious proposer releases their block at slot `n + 1` late so that it re-orgs the block at the slot _after_ them (at slot `n + 2`). Non-boosting validators will follow this re-org and vote for block `n + 1` in slot `n + 2`. Boosted validators will vote for `n + 2`. If the boosting validators are outnumbered, there'll be a re-org to the malicious block from `n + 1` and validators applying the boost will have their slot `n + 2` attestations miss head (and target on an epoch boundary). Note that all the attesters from slot `n + 1` are doomed to lose their head vote rewards, but this is the same regardless of boosting. Therefore, Lighthouse nodes stand to miss slightly more head votes than other nodes if they are in the minority while applying the proposer boost. Once the proposer boost nodes gain a majority, this trend reverses. ## Ex post re-org (using the boost) The other type of re-org is an ex post re-org using the strategy described here: #2860. With this strategy, boosted nodes will follow the attempted re-org and again lose a head vote if the re-org is unsuccessful. Once boosting is widely adopted, the re-orgs will succeed and the non-boosting validators will lose out. I don't think there are (m)any validators applying this strategy, because it is irrational to attempt it before boosting is widely adopted. Therefore I think we can safely ignore this possibility. ## Risk Assessment From observing re-orgs on mainnet I don't think ex ante re-orgs are very common. I've observed around 1 per day for the last month on my node (see: https://gist.github.com/michaelsproul/3b2142fa8fe0ff767c16553f96959e8c), compared to 2.5 ex post re-orgs per day. Given one extra slot per day where attesting will cause a missed head vote, each individual validator has a 1/32 chance of being assigned to that slot. So we have an increase of 1/32 missed head votes per validator per day in expectation. Given that we currently see ~7 head vote misses per validator per day due to late/missing blocks (and re-orgs), this represents only a (1/32)/7 = 0.45% increase in missed head votes in expectation. I believe this is so small that we shouldn't worry about it. Particularly as getting proposer boost deployed is good for network health and may enable us to drive down the number of late blocks over time (which will decrease head vote misses). ## TL;DR Enable proposer boost now and release ASAP, as financial downside is a 0.45% increase in missed head votes until widespread adoption.
## Proposed Changes Mitigate the fork choice attacks described in [_Three Attacks on Proof-of-Stake Ethereum_](https://arxiv.org/abs/2110.10086) by enabling proposer boost @ 70% on mainnet. Proposer boost has been running with stability on Prater for a few months now, and is safe to roll out gradually on mainnet. I'll argue that the financial impact of rolling out gradually is also minimal. Consider how a proposer-boosted validator handles two types of re-orgs: ## Ex ante re-org (from the paper) In the mitigated attack, a malicious proposer releases their block at slot `n + 1` late so that it re-orgs the block at the slot _after_ them (at slot `n + 2`). Non-boosting validators will follow this re-org and vote for block `n + 1` in slot `n + 2`. Boosted validators will vote for `n + 2`. If the boosting validators are outnumbered, there'll be a re-org to the malicious block from `n + 1` and validators applying the boost will have their slot `n + 2` attestations miss head (and target on an epoch boundary). Note that all the attesters from slot `n + 1` are doomed to lose their head vote rewards, but this is the same regardless of boosting. Therefore, Lighthouse nodes stand to miss slightly more head votes than other nodes if they are in the minority while applying the proposer boost. Once the proposer boost nodes gain a majority, this trend reverses. ## Ex post re-org (using the boost) The other type of re-org is an ex post re-org using the strategy described here: sigp#2860. With this strategy, boosted nodes will follow the attempted re-org and again lose a head vote if the re-org is unsuccessful. Once boosting is widely adopted, the re-orgs will succeed and the non-boosting validators will lose out. I don't think there are (m)any validators applying this strategy, because it is irrational to attempt it before boosting is widely adopted. Therefore I think we can safely ignore this possibility. ## Risk Assessment From observing re-orgs on mainnet I don't think ex ante re-orgs are very common. I've observed around 1 per day for the last month on my node (see: https://gist.github.com/michaelsproul/3b2142fa8fe0ff767c16553f96959e8c), compared to 2.5 ex post re-orgs per day. Given one extra slot per day where attesting will cause a missed head vote, each individual validator has a 1/32 chance of being assigned to that slot. So we have an increase of 1/32 missed head votes per validator per day in expectation. Given that we currently see ~7 head vote misses per validator per day due to late/missing blocks (and re-orgs), this represents only a (1/32)/7 = 0.45% increase in missed head votes in expectation. I believe this is so small that we shouldn't worry about it. Particularly as getting proposer boost deployed is good for network health and may enable us to drive down the number of late blocks over time (which will decrease head vote misses). ## TL;DR Enable proposer boost now and release ASAP, as financial downside is a 0.45% increase in missed head votes until widespread adoption.
I stumbled on this idea independently today when discussing with flashbots people how they (or mev hungry folks) might abuse block delivery times later than 4s to wait and gather more MEV. My answer was "if proposers at N are dishonest and release blocks late, proposers at N+1 are likely to be dishonest and reorg them using proposer boost". It's kind of a poor man's That said... I'd like to surface this to a larger conversation and consider it as a timeliness mitigation in the specs rather than this as optional behavior in a single client type. Done incorrectly, this can be dangerous. E.g., if you didn' have the strict rules about prev-slot and parent-prev-slot and we have >4s latencies, all blocks will be orphaned. I'm glad you did think this particular edge case through to bound the cascading behavior, but if we leave it to many different, disparate client implementations, we might get unexpected outcomes. |
I don't think you'll need this mitigation after we include the unrealized justified/finalized leaves fix discussed at devconnect. That is, any leaf in N should have it's justificaiton/finalization info be consider in N+1 based on the what would happn in that epoch transition regardless of if a block transitioned it into that epoch. |
Agree. I was planning to take Adrian's advice and raise a PR to the honest validator spec. Do you think the conditions I have at the moment are neat enough? They feel a little bit artificial, and perhaps trigger-happy validators will be tempted to loosen them, but that might be the best we can do.
I was thinking about this more yesterday, and I think it's probably still better to play it safe at the epoch boundary. The block in slot 31 might include just enough attestations to justify the epoch (while the block at slot 30 does not). If we build a new block at 32 on 30, then it will have a lower justified checkpoint than 31, which I think will make it inferior from fork choice's PoV, even with the newest changes? |
## Proposed Changes Mitigate the fork choice attacks described in [_Three Attacks on Proof-of-Stake Ethereum_](https://arxiv.org/abs/2110.10086) by enabling proposer boost @ 70% on mainnet. Proposer boost has been running with stability on Prater for a few months now, and is safe to roll out gradually on mainnet. I'll argue that the financial impact of rolling out gradually is also minimal. Consider how a proposer-boosted validator handles two types of re-orgs: ## Ex ante re-org (from the paper) In the mitigated attack, a malicious proposer releases their block at slot `n + 1` late so that it re-orgs the block at the slot _after_ them (at slot `n + 2`). Non-boosting validators will follow this re-org and vote for block `n + 1` in slot `n + 2`. Boosted validators will vote for `n + 2`. If the boosting validators are outnumbered, there'll be a re-org to the malicious block from `n + 1` and validators applying the boost will have their slot `n + 2` attestations miss head (and target on an epoch boundary). Note that all the attesters from slot `n + 1` are doomed to lose their head vote rewards, but this is the same regardless of boosting. Therefore, Lighthouse nodes stand to miss slightly more head votes than other nodes if they are in the minority while applying the proposer boost. Once the proposer boost nodes gain a majority, this trend reverses. ## Ex post re-org (using the boost) The other type of re-org is an ex post re-org using the strategy described here: sigp#2860. With this strategy, boosted nodes will follow the attempted re-org and again lose a head vote if the re-org is unsuccessful. Once boosting is widely adopted, the re-orgs will succeed and the non-boosting validators will lose out. I don't think there are (m)any validators applying this strategy, because it is irrational to attempt it before boosting is widely adopted. Therefore I think we can safely ignore this possibility. ## Risk Assessment From observing re-orgs on mainnet I don't think ex ante re-orgs are very common. I've observed around 1 per day for the last month on my node (see: https://gist.github.com/michaelsproul/3b2142fa8fe0ff767c16553f96959e8c), compared to 2.5 ex post re-orgs per day. Given one extra slot per day where attesting will cause a missed head vote, each individual validator has a 1/32 chance of being assigned to that slot. So we have an increase of 1/32 missed head votes per validator per day in expectation. Given that we currently see ~7 head vote misses per validator per day due to late/missing blocks (and re-orgs), this represents only a (1/32)/7 = 0.45% increase in missed head votes in expectation. I believe this is so small that we shouldn't worry about it. Particularly as getting proposer boost deployed is good for network health and may enable us to drive down the number of late blocks over time (which will decrease head vote misses). ## TL;DR Enable proposer boost now and release ASAP, as financial downside is a 0.45% increase in missed head votes until widespread adoption.
Oh I was thinking about this situation the other day and even made it a unit test |
I agree with this take, but I am worried that both this PR, as well as a proper (block, slot)-voting won't be able to prevent late block proposing / attesting. I wrote down my thoughts in a HackMD in response to a Flashbots GitHub issue. tldr:
It turns out that the above logic not only applies to this PR, but unfortunatelty also to (block, slot)-voting. As soon as proposers start releasing their blocks around the attestation deadline we have the potential to run into this vicious cycle that ends up with late block proposals / attestations. At least this is my worry/understanding at the moment, more details in the linked HackMD. Because of this I am starting to think more about how to incentivize timely block releasing specifically. Currently the block proposer is rewarded in proportion to the profitability of the attestations they include in their block. Instead we could try to also account for the proposer’s timeliness using some heuristic. One heuristic could be to scale the proposer’s reward by the share of same-slot committee votes that the block receives and are included in the subsequent block. At this point this is definitely more of an idea than a proposal. Also it won't be effective if MEV rewards are dominating timeliness rewards. Regardless, I think it can help keep the timing equilibria in line and incentivizing it specifically would make sense either way imho. A high level write-up can be found here. |
I'm marking this ready for review again 🎉 Conversation on the spec side has died down with seemingly everyone satisfied by the finalization check (the participation rate check has been axed for now). Most of the conversation occurred on the Ethereum R&D Discord: https://discord.com/channels/595666850260713488/595701173944713277/1049263250275061790, with Mikhail, Francesco, Potuz, JGM and Chris & Ruteri (Flashbots) all weighing in. I think we should merge the implementation in Lighthouse before the upstream spec merges, because the upstream spec is blocked on the unrealized justification/finalization spec for fork choice, but is otherwise ready. We can also tweak the implementation after merging if desired. On the builder side, it seems that BloxRoute are ready for late blocks arriving after 4 second and Flashbots are ready for late blocks arriving after 6 seconds. I've asked the Flashbots devs to update their cutoff to 4 seconds as well. I think incentives are sufficient that block builders will adopt these changes rapidly as the re-org strategy gains popularity on mainnet. In case of builder failure there's always the local fallback. I'm running the re-org strategy on all ~20K of Sigma Prime's Goerli validators. I've collected some successful re-orgs that used builder blocks (all BloxRoute for now) here: https://gist.github.com/michaelsproul/720459c93f20ee61911ab1732337f7ca |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Nice catch on the shuffling 👍 Let's do some re-orging!
bors r+ |
## Proposed Changes With proposer boosting implemented (#2822) we have an opportunity to re-org out late blocks. This PR adds three flags to the BN to control this behaviour: * `--disable-proposer-reorgs`: turn aggressive re-orging off (it's on by default). * `--proposer-reorg-threshold N`: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled. * `--proposer-reorg-epochs-since-finalization N`: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally. For safety Lighthouse will only attempt a re-org under very specific conditions: 1. The block being proposed is 1 slot after the canonical head, and the canonical head is 1 slot after its parent. i.e. at slot `n + 1` rather than building on the block from slot `n` we build on the block from slot `n - 1`. 2. The current canonical head received less than N% of the committee vote. N should be set depending on the proposer boost fraction itself, the fraction of the network that is believed to be applying it, and the size of the largest entity that could be hoarding votes. 3. The current canonical head arrived after the attestation deadline from our perspective. This condition was only added to support suppression of forkchoiceUpdated messages, but makes intuitive sense. 4. The block is being proposed in the first 2 seconds of the slot. This gives it time to propagate and receive the proposer boost. ## Additional Info For the initial idea and background, see: ethereum/consensus-specs#2353 (comment) There is also a specification for this feature here: ethereum/consensus-specs#3034 Co-authored-by: Michael Sproul <micsproul@gmail.com> Co-authored-by: pawan <pawandhananjay@gmail.com>
Build failed (retrying...): |
## Proposed Changes With proposer boosting implemented (#2822) we have an opportunity to re-org out late blocks. This PR adds three flags to the BN to control this behaviour: * `--disable-proposer-reorgs`: turn aggressive re-orging off (it's on by default). * `--proposer-reorg-threshold N`: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled. * `--proposer-reorg-epochs-since-finalization N`: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally. For safety Lighthouse will only attempt a re-org under very specific conditions: 1. The block being proposed is 1 slot after the canonical head, and the canonical head is 1 slot after its parent. i.e. at slot `n + 1` rather than building on the block from slot `n` we build on the block from slot `n - 1`. 2. The current canonical head received less than N% of the committee vote. N should be set depending on the proposer boost fraction itself, the fraction of the network that is believed to be applying it, and the size of the largest entity that could be hoarding votes. 3. The current canonical head arrived after the attestation deadline from our perspective. This condition was only added to support suppression of forkchoiceUpdated messages, but makes intuitive sense. 4. The block is being proposed in the first 2 seconds of the slot. This gives it time to propagate and receive the proposer boost. ## Additional Info For the initial idea and background, see: ethereum/consensus-specs#2353 (comment) There is also a specification for this feature here: ethereum/consensus-specs#3034 Co-authored-by: Michael Sproul <micsproul@gmail.com> Co-authored-by: pawan <pawandhananjay@gmail.com>
Build failed: |
bors retry |
## Proposed Changes With proposer boosting implemented (#2822) we have an opportunity to re-org out late blocks. This PR adds three flags to the BN to control this behaviour: * `--disable-proposer-reorgs`: turn aggressive re-orging off (it's on by default). * `--proposer-reorg-threshold N`: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled. * `--proposer-reorg-epochs-since-finalization N`: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally. For safety Lighthouse will only attempt a re-org under very specific conditions: 1. The block being proposed is 1 slot after the canonical head, and the canonical head is 1 slot after its parent. i.e. at slot `n + 1` rather than building on the block from slot `n` we build on the block from slot `n - 1`. 2. The current canonical head received less than N% of the committee vote. N should be set depending on the proposer boost fraction itself, the fraction of the network that is believed to be applying it, and the size of the largest entity that could be hoarding votes. 3. The current canonical head arrived after the attestation deadline from our perspective. This condition was only added to support suppression of forkchoiceUpdated messages, but makes intuitive sense. 4. The block is being proposed in the first 2 seconds of the slot. This gives it time to propagate and receive the proposer boost. ## Additional Info For the initial idea and background, see: ethereum/consensus-specs#2353 (comment) There is also a specification for this feature here: ethereum/consensus-specs#3034 Co-authored-by: Michael Sproul <micsproul@gmail.com> Co-authored-by: pawan <pawandhananjay@gmail.com>
## Proposed Changes With proposer boosting implemented (sigp#2822) we have an opportunity to re-org out late blocks. This PR adds three flags to the BN to control this behaviour: * `--disable-proposer-reorgs`: turn aggressive re-orging off (it's on by default). * `--proposer-reorg-threshold N`: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled. * `--proposer-reorg-epochs-since-finalization N`: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally. For safety Lighthouse will only attempt a re-org under very specific conditions: 1. The block being proposed is 1 slot after the canonical head, and the canonical head is 1 slot after its parent. i.e. at slot `n + 1` rather than building on the block from slot `n` we build on the block from slot `n - 1`. 2. The current canonical head received less than N% of the committee vote. N should be set depending on the proposer boost fraction itself, the fraction of the network that is believed to be applying it, and the size of the largest entity that could be hoarding votes. 3. The current canonical head arrived after the attestation deadline from our perspective. This condition was only added to support suppression of forkchoiceUpdated messages, but makes intuitive sense. 4. The block is being proposed in the first 2 seconds of the slot. This gives it time to propagate and receive the proposer boost. ## Additional Info For the initial idea and background, see: ethereum/consensus-specs#2353 (comment) There is also a specification for this feature here: ethereum/consensus-specs#3034 Co-authored-by: Michael Sproul <micsproul@gmail.com> Co-authored-by: pawan <pawandhananjay@gmail.com>
With proposer boosting implemented (sigp#2822) we have an opportunity to re-org out late blocks. This PR adds three flags to the BN to control this behaviour: * `--disable-proposer-reorgs`: turn aggressive re-orging off (it's on by default). * `--proposer-reorg-threshold N`: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled. * `--proposer-reorg-epochs-since-finalization N`: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally. For safety Lighthouse will only attempt a re-org under very specific conditions: 1. The block being proposed is 1 slot after the canonical head, and the canonical head is 1 slot after its parent. i.e. at slot `n + 1` rather than building on the block from slot `n` we build on the block from slot `n - 1`. 2. The current canonical head received less than N% of the committee vote. N should be set depending on the proposer boost fraction itself, the fraction of the network that is believed to be applying it, and the size of the largest entity that could be hoarding votes. 3. The current canonical head arrived after the attestation deadline from our perspective. This condition was only added to support suppression of forkchoiceUpdated messages, but makes intuitive sense. 4. The block is being proposed in the first 2 seconds of the slot. This gives it time to propagate and receive the proposer boost. For the initial idea and background, see: ethereum/consensus-specs#2353 (comment) There is also a specification for this feature here: ethereum/consensus-specs#3034 Co-authored-by: Michael Sproul <micsproul@gmail.com> Co-authored-by: pawan <pawandhananjay@gmail.com>
Proposed Changes
With proposer boosting implemented (#2822) we have an opportunity to re-org out late blocks.
This PR adds three flags to the BN to control this behaviour:
--disable-proposer-reorgs
: turn aggressive re-orging off (it's on by default).--proposer-reorg-threshold N
: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled.--proposer-reorg-epochs-since-finalization N
: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally.For safety Lighthouse will only attempt a re-org under very specific conditions:
n + 1
rather than building on the block from slotn
we build on the block from slotn - 1
.Additional Info
For the initial idea and background, see: ethereum/consensus-specs#2353 (comment)
There is also a specification for this feature here: ethereum/consensus-specs#3034